Using subcategorization frames to improve French probabilistic parsing

Authors

  • Anthony Sigogne
  • Matthieu Constant
Abstract

This article presents results on probabilistic parsing enhanced with a word clustering approach based on a French syntactic lexicon, the Lefff (Sagot, 2010). We show that applying this clustering method to the verbs and adjectives of the French Treebank (Abeillé et al., 2003) yields accurate results on French with a parser based on a Probabilistic Context-Free Grammar (Petrov et al., 2006).
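The abstract does not spell out the clustering procedure. As a minimal sketch of one plausible reading, the Python below groups lemmas that share exactly the same set of subcategorization frames and substitutes a cluster symbol for the word form before parsing. The toy lexicon, the frame labels, the `VCLUSTER_` naming, and both function names are assumptions made for illustration; they are not taken from the Lefff or from the authors' implementation.

```python
from collections import defaultdict

# Hypothetical toy lexicon: lemma -> set of subcategorization frames.
# Frame labels are illustrative only, not actual Lefff entries.
TOY_LEXICON = {
    "manger": {"Suj", "Suj Obj"},
    "boire": {"Suj", "Suj Obj"},
    "donner": {"Suj Obj AObj"},
    "offrir": {"Suj Obj AObj"},
}


def build_clusters(lexicon):
    """Map each lemma to a cluster id shared by all lemmas with identical frame sets."""
    groups = defaultdict(list)
    for lemma, frames in lexicon.items():
        groups[frozenset(frames)].append(lemma)
    clusters = {}
    # Sort the frame sets so that cluster ids are deterministic across runs.
    for idx, key in enumerate(sorted(groups, key=lambda k: sorted(k))):
        for lemma in groups[key]:
            clusters[lemma] = f"VCLUSTER_{idx}"
    return clusters


def replace_tokens(tagged, clusters):
    """Replace the word form by its cluster id when the lemma is in the lexicon.

    `tagged` is a list of (word, lemma, tag) triples; unknown lemmas keep their word form.
    """
    return [(clusters.get(lemma, word), tag) for word, lemma, tag in tagged]


clusters = build_clusters(TOY_LEXICON)
sentence = [("mange", "manger", "V"), ("pomme", "pomme", "N")]
rewritten = replace_tokens(sentence, clusters)
```

Replacing word forms by cluster symbols in this way reduces the number of distinct lexical items the grammar has to estimate probabilities for, which is the usual motivation for clustering in treebank parsing.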


Related resources

Automatic extraction of subcategorization frames for French

This paper describes the integration of corpus-based syntactic subcategorization frames into a large-scale, theory-neutral lexical resource for French (Romary et al. (2004)). This database is the first to implement the Lexical Markup Framework (LMF), an international initiative towards ISO standards for lexical databases (ISO TC 37/SC 4). The subcategorization frames have been acquired via a de...


From the corpus to the lexicon: the example of data models for verb subcategorization

This paper describes the integration of corpus-based syntactic subcategorization frames and correlated semantic information into a large-scale, cross-theoretically informed lexical database for French (Romary et al. (2004)). This database is the first to implement the Lexical Markup Framework (LMF), an international initiative towards ISO standards for lexical databases (ISO TC 37/SC 4). The su...


Lexicalization in Crosslinguistic Probabilistic Parsing: The Case of French

This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a baseline model, which is enriched to the level of Collins’ Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The ...


Integrating Selectional Constraints and Subcategorization Frames in a Dependency Parser

Statistical parsers are trained on treebanks that are composed of a few thousand sentences. In order to prevent data sparseness and computational complexity, such parsers make strong independence hypotheses on the decisions that are made to build a syntactic tree. These independence hypotheses yield a decomposition of the syntactic structures into small pieces, which in turn prevent the parser ...


Enhancing FreeLing Rule-Based Dependency Grammars with Subcategorization Frames

Despite recent advances in parsing, significant effort is still needed to improve the performance of current parsers, for example by enhancing argument/adjunct recognition. There is evidence that verb subcategorization frames can contribute to parser accuracy, but a number of issues remain open. The main aim of this paper is to show how subcategorization frames acquired from a syntactically ...




Publication date: 2012